An apparatus and method for verifying a code displayed on a container surface against a target code, the 
method comprising the steps of: capturing an image of the container surface carrying the displayed code; 
digitising the captured image to form a pixel array in which each pixel in the array has a respective quantisation 
level; scanning the pixel array to detect potential characters; selecting and grouping together potential characters 
of the displayed code; determining from the pixel values of the potential characters a set of recognised 
characters constituting a recognised code; comparing the target code with the recognised code; and either 
verifying or rejecting the recognised code in dependence upon the results of the comparison between the target 
code and the recognised code. 

The present invention relates to a method and apparatus for verifying a container code and more 
specifically for verifying an identification code on a cargo container against a target code. 

Cargo containers each have a unique identification code (ID code) which is painted or applied by other 
means onto the sides of the container surface. The ID code must be read and verified against a computer 
record whenever a container is towed in or out of a container yard in a port. This ensures that the correct 
truck is towing the correct container, or containers, through a gate of a cargo terminal. Currently, the code is 
read and verified manually as the truck passes through the gate. The truck must be stopped at the gate 
whilst each ID code on each container being towed by the truck is checked. Human inspection of the ID 
code is slow and prone to oversights and errors. Each gate must be manned by at least one operator, thus 
necessitating significant manpower. 

One object of the present invention is to provide a container number recognition apparatus which 
automates the present verification process. 

Another object of the present invention is to provide a container number recognition apparatus which 
improves the efficiency of the verification process and requires the presence of only one gate operator to 
man multiple gates at the port. 

Accordingly, in one aspect, the present invention provides a method of verifying a code displayed on a 
container surface against a target code comprising the steps of: capturing an image of the container surface 
carrying the displayed code; digitising the captured image to form a pixel array in which each pixel in the 
array has a respective quantisation level; scanning the pixel array to detect potential characters; selecting 
and grouping together potential characters of the displayed code; determining from the pixel values of the 
potential characters a set of recognised characters constituting a recognised code; comparing the target 
code with the recognised code; and either verifying or rejecting the recognised code in dependence upon 
the results of the comparison between the target code and the recognised code. 

In another aspect, the present invention provides a neural network for analysing an array comprising a 
plurality of elements, the array being divided into a plurality of windows, the neural network comprising a 
plurality of input nodes defining an input layer and at least one output node defining an output layer, one or 
more intermediate layers being disposed between the input layer and the output layer, in which neural 
network the input layer is divided into a plurality of discrete areas each corresponding to a respective 
window; the value of each element in a window represents the input for a corresponding input node in an 
area of the input layer corresponding to the respective window; each node within the network computes an 
output in accordance with a predetermined function; the outputs of the nodes in each area of the input layer 
are connected to specified nodes in a first intermediate layer which are not connected to the outputs of the 
nodes in another area of the input layer; the outputs of the nodes in the first and subsequent intermediate 
layers being connected to the inputs of the nodes in the immediately following layer; and the output nodes 
of the last intermediate layer are connected to the inputs of the output node or nodes of the output layer. 

In a further aspect, the present invention provides a container code verification apparatus comprising: 
image capture means to capture an image of a container surface carrying a displayed code; data 
processing means to form a pixel array from the captured image; scanning means to scan the pixel array; 
detection means to detect potential characters; selection and grouping means to select and group together 
potential characters of the displayed codes; decision means to determine from the pixel values of the 
potential characters a set of recognised characters constituting a recognised code; comparison means to 
compare the target code with the recognised code; and verification means to verify or reject the recognised 
code in dependence upon the results of the comparison between the target code and the recognised code. 

In order that the present invention may be more readily understood, an embodiment thereof is now 
described, by way of example, with reference to the accompanying drawings in which: 

Figures 1a and 1b are examples of container identification codes which have been highlighted, the ID 
code in Figure 1a in a light colour on a dark background and the ID code of Figure 1b in a dark colour 
on a light background; 

Figure 2 is a diagrammatic representation of one embodiment of the present invention; 

Figure 3 illustrates a truck in position at a gate installed with an embodiment of the present invention; 

Figure 4 illustrates vertical and horizontal segments in a pixel array in accordance with one embodiment 
of the present invention; 

Figure 5 illustrates a right boundary vertical segment in accordance with one embodiment of the present 
invention; 

Figure 6 is a schematic representation of a multi-layer neural network structure as incorporated in one 
embodiment of the present invention; 

Figure 7(i) illustrates a window-based neural network architecture for use with one embodiment of the 
present invention; 

Figure 7(ii) illustrates a conventional neural network architecture; 

Figure 8 illustrates an example of a matching problem; 

Figure 9 illustrates a matching path found for the example of Figure 8; and 

Figure 10 illustrates the local continuity constraints applicable to an embodiment of the present invention. 

A cargo container, such as a standard ISO container, comprises a rigid metal box container. Each 
container is identified by a unique ID code comprising a string of alphanumeric characters which is usually 
painted onto at least one surface of the container. Other information is also applied to the container surface, 
such as gross weight, net weight, country of origin and the like. Thus, the ID code is located amongst other 
characters containing the other information. The containers themselves, due to the environment in which 
they are used, often become marked, dirty or dented. The presence of corrugations, structural bars, smears 
and other noise may distort the characters. Thus, the variation in character and background intensity and 
the inherent lack of adequate contrast pose problems for a reliable method and apparatus for ID code 
character extraction, recognition and verification. The intensity and contrast of the characters and the 
background vary with the illumination of the container surface in different conditions, such as 
during daylight, night time and cloud cover. Also, the characters of the ID code may be presented to the 
recognition apparatus at an angle, thus resulting in characters which are skewed as shown in Figures 1a 
and 1b, which also illustrate other characters in addition to the ID code which are present on each container. 

Referring to Figure 2, the apparatus embodying the invention comprises a plurality of cameras 1, 2 and 
3, three in this case, and a multiplexer 4 which transmits the signals from the cameras 1, 2 and 3 to a transputer 
network 5 via a BNC connection 6. The flow of data from the transputer network 5 is connected to and 
controlled by a host computer 7 via an AT bus 8. The host computer 7 may be, for example, a PC-AT 386 
microcomputer which is in turn connected by an RS-232 serial link 9 to a gate computer 10 which stores 
container information obtained from a mainframe computer (not shown). 

When a truck loaded with a container approaches a gate and stops at the gate (as shown in Figure 3), 
the gate operator initiates the verification process on the host PC 7. Three closed circuit TV cameras 1, 2 
and 3 are used to capture images of the container 8. Each image comprises an array of 720 x 512 pixels, 
each pixel being capable of 256 grey levels. The images are sent via the multiplexer 4 to the transputer 
network 5. The transputer network 5 comprises a monochrome frame grabber 11 (Figure 2) which acts as 
a master transputer and controls the flow of execution commands to a number of worker transputers 12, 13 
and 14 (in this case three), and a root transputer 15 which is connected to the host PC 7 and allows 
communication between the host PC 7 and the other transputers. The host PC 7 is responsible for file 
saving, input and output via a keyboard and VDU and for communication with the gate computer 10, 
which stores a record of container information obtained from a mainframe computer and which can be 
interrogated by the host PC 7. The host PC 7 also displays the results of the verification to the operator. 

When a truck with a container 8, or containers, arrives at the gate, the mainframe computer sends 
information about the container 8 expected at the gate, including which container ID code 16 is expected, to 
the gate computer 10. 

The appropriate camera 1, 2 or 3 is chosen to focus on the rear of the container 8 and the frame grabber 
contrast setting is adjusted to obtain a good level of contrast between the characters on the container 8 and 
the background. A segmentation process then locates and extracts a bounding box for each ID character of 
the ID code 16 from the container image. The extracted bounding box is normalised to a standard size and 
a character pixel map of this information is passed to a network character recogniser which calculates the 
alphanumeric character most likely to be represented by the character pixel map. A proposed ID character 
code comprising the recognised characters derived from the image information extracted from the container 
8 is matched with the expected ID code provided by the gate computer 10 and the two codes are 
compared. The comparison of the two ID codes produces a confidence measure which indicates the degree 
to which the two ID codes match. The operator can then decide to hold back or allow the truck to pass 
through depending on the confidence measure determined by the matching comparison between the two ID 
codes. If the confidence measure is over a pre-set threshold then the truck may continue; if not, then the 
truck is held at the gate. 

The container number recognition system can handle the entire range of container sizes. This includes 
containers of length 20 ft, 40 ft and 45 ft (6.1m, 12.2m and 13.7m respectively) as well as heights varying 
from 8 ft to 9.5 ft (2.4m and 2.9m respectively). Also, the length of the chassis on which the container, or 
containers, are placed can be 20 ft, 40 ft or 45 ft (6.1m, 12.2m and 13.7m respectively). Therefore, the 
number and location of the cameras is such that the container images captured are of adequate quality to 
enable the ID code characters to be resolved. 

The operation of the apparatus is now discussed in more detail: 

During daytime operation the natural level of illumination of containers 8 at the gate is normally 
sufficient and as such no external lighting is necessary. However, in operation during the evening and the 
night, fluorescent lights 17, 18 are used to illuminate the containers 8. As shown in Figure 3, a rear light 17 
is located behind the container 8 facing the gate house 19 to improve the contrast of the image of the 
container 8 against the background. A frontal illumination source 18 is located facing the container 8 to 
improve the illumination of features on the surface of the container such as the ID code characters 16. If the 
characters which are to be extracted from the surface of the container are to be resolvable from the 
background of the container surface then it has been found that there must be an intensity contrast between 
the background and the subject of at least 15 - 20% of the overall image intensity. The closed circuit TV 
cameras 1, 2 and 3 have an auto-iris capability to compensate for small variations in illumination. 

Upon receipt of a target ID code from the gate computer 10 the appropriate camera 1, 2 or 3 is chosen 
to focus on the rear of the container 8. The selected camera 1, 2 or 3 triggers the frame grabber 11 which 
applies an algorithm to adjust the contrast setting of the frame grabber 11 so that the mean grey level for a 
predetermined image window 20 defined by the camera's field of view is at an optimal value. The algorithm 
uses Newton's method to compute the new value of the contrast setting according to equation (i): 

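\[
\text{new contrast} = \text{old contrast} + \text{slope} \times (\text{mean} - \text{optimum mean}) \tag{i}
\]

\[
\text{slope} = \Delta\,\text{contrast} \,/\, \Delta\,\text{mean} \quad \text{at last iteration}
\]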

The delta operator (Δ) indicates the change in contrast or mean grey level respectively between the last 
iteration and the current one. The optimal mean value and the slope have been experimentally determined. 
Having determined the contrast setting, the camera 1, 2 or 3 captures an image which is sent via the 
multiplexer 4 to the frame grabber 11 where the image is processed further. The image consists of an array 
of 720 x 512 pixels, each pixel being capable of a quantisation level of between 0 and 255. 

To determine the location of all potential characters within the image and to establish a surrounding 
rectangle or bounding box around each potential character, the captured image is scanned by the frame 
grabber one column at a time. The grey level of each pixel in each column is quantised and horizontal and 
vertical segments are created based on the quantised value. A horizontal segment (see Figure 4) is defined 
as a part of a row of the image in a column for which the adjacent pixels (in the horizontal direction) have 
the same quantised value. Similarly, a vertical segment (see Figure 5) is a part of a column for which 
adjacent pixels (in the vertical direction) have the same quantised value. Therefore, the image has now 
been divided into vertical and horizontal segments consisting of a number of pixels which are associated by 
their respective quantised values and the image can now be scanned segment by segment, thus saving 
both time and memory space. 
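The segment construction described above amounts to grouping adjacent pixels with equal quantised values into runs. The following is a minimal sketch only: the helper names `quantise` and `runs` and the choice of four grey bands are assumptions made for illustration, since the text does not state how many quantisation bands the frame grabber uses.

```python
def quantise(grey, bands=4):
    """Map a grey level 0..255 to one of `bands` bands (band count is an assumption)."""
    return grey * bands // 256


def runs(values):
    """Split a sequence into maximal runs of equal values.

    Returns a list of (start, end, value) tuples with `end` exclusive.
    """
    segments = []
    start = 0
    for k in range(1, len(values) + 1):
        # Close the current run at the end of the data or when the value changes.
        if k == len(values) or values[k] != values[start]:
            segments.append((start, k, values[start]))
            start = k
    return segments


column = [10, 12, 200, 210, 220, 15]   # grey levels down one image column
vertical_segments = runs([quantise(g) for g in column])
```

Applying `runs` to a row instead of a column gives horizontal segments in the same way. Later stages can then work on a handful of segments per column rather than on every pixel.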

The vertical segments are scanned and all segments which are on the right hand boundary of a 
character are detected. These are defined as right boundary vertical segments (see Figure 4) and each of 
these segments must satisfy two conditions: there must be a corresponding vertical segment (V2) to the left of 
the vertical segment (V1) of which at least a part must be adjacent to the first segment (V1), and the 
quantised value of the corresponding vertical segment (V2) must be different from that of the right boundary 
vertical segment (V1). 

The threshold or extent of each potential character is based on the pixel grey level at the boundary 
defined by the boundary segments. If the background colour of the image is white, then the threshold is 
selected as the lowest grey level. If the background colour of the image is black, then the threshold is 
selected as the highest grey level. The threshold is used to determine if the potential character is 
connected to any other components which may belong to the same character. An example of connected 
horizontal segments is shown in Figure 4 which shows three connected horizontal components h1 to h3. By 
limiting the character size to a predetermined level, the number of connected components will be reduced. 
Each character is surrounded by a bounding box which defines the spatial extent of a character. 

This process creates a character pixel map comprising horizontal and vertical segments. Features 
based on the aspect ratio and the histogram of the character pixel map are computed. If the value of the 

feature defined by the character pixel map is not in an acceptable range which has been defined by 
heuristic rules, then the feature is eliminated as noise. 

The resultant features or characters which are in the acceptable range are processed further to define 
specific groupings of characters. This process determines the location of all potential characters in the 
image and finds the surrounding rectangle (bounding box) for each potential character. The characters are 

thus grouped together by proximity and height based on their relative spatial coordinates in order to retain 
only those characters which should belong to the ID code. Figures 1a and 1b show the ID code characters 
in one group which has been highlighted. The other characters are also grouped together, but not 
highlighted. 

Thus, having established the location of all the potential characters and surrounded each character with 
a bounding box, all the bounding boxes are sorted in the horizontal (x) and vertical (y) directions. This allows 
the characters to be further sorted into groups of horizontal rows. The ID code usually occurs in one, two or 
three horizontal rows. Any single isolated characters are discarded at this stage. 

The polarity of the characters in each group is determined, i.e. whether the characters are white on 
black or black on white. This information is determined based on the uniformity of the height and width of 
the characters in the group, assuming that there are more uniform characters of one polarity than of another. 
This information is determined by using a belief or uniformity measure as disclosed in Nguyen HT (1978) 
"On random sets and belief functions", Journal of Mathematical Analysis and Applications 65:531-542. 

The grouped characters are selected starting from the top-most row of the image. Those groups which 
have a belief uniformity measure above a predetermined level are chosen. The number of character groups 
selected is guided by the total number of characters present in the target ID code provided by the 
mainframe computer as well as the uniformity measure. 

At this stage, groups of potential characters have been identified from the image. However, these 
extracted characters may be skewed, of different sizes and on different backgrounds. Therefore, a 
normalisation step is carried out which firstly converts the pixel maps of each character to a white character 
on a black background and secondly standardises the size of each character pixel map using scale factors 
computed in both x and y directions. Usually, this results in a linear compression since the size of the 
extracted characters is larger than the standard size. 

The normalised grey-scale character pixel map is presented as the input to a neural network. The 
quantised grey level of each pixel is normalised to a value between 0 and 1. The grey level value is not 
binarised to either 0 or 1 as the choice of the binarisation threshold would influence the shape of the 
character defined by the pixel map. This may result in portions of a character being artificially connected or 
broken. 
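The normalisation described above (polarity conversion, resizing with separate x and y scale factors, and scaling of grey levels to the range 0 to 1 without binarisation) can be sketched as follows. This is an illustration under stated assumptions: the function name `normalise`, the 16 x 16 default size and the use of nearest-neighbour sampling are not taken from the text, which only says that scale factors are computed in both directions.

```python
def normalise(pixels, dark_on_light, out_w=16, out_h=16):
    """Return an out_h x out_w map of floats in [0, 1], light character on dark.

    pixels: list of rows of grey levels 0..255 (the extracted bounding box).
    dark_on_light: True if the character is darker than its background.
    """
    in_h, in_w = len(pixels), len(pixels[0])
    result = []
    for y in range(out_h):
        src_y = y * in_h // out_h          # scale factor in the y direction
        row = []
        for x in range(out_w):
            src_x = x * in_w // out_w      # scale factor in the x direction
            g = pixels[src_y][src_x]
            if dark_on_light:
                g = 255 - g                # convert to white-on-black polarity
            row.append(g / 255.0)          # keep the grey scale; no binarisation
        result.append(row)
    return result
```

For example, `normalise([[0, 255], [255, 0]], True, 2, 2)` inverts a dark-on-light patch and returns values between 0 and 1 rather than hard 0/1 decisions made by a threshold.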

The neural network is used to recognise the patterns defined by the standard size characters. This 
method is utilised because it offers good tolerance to noise and deformations in the input characters. 
Specifically, a multi-layer feed-forward window-based neural network model is used. The general 
architecture of the neural network is shown in Figure 6. 

The neural network consists of an input layer, one or more hidden intermediate layers and an output 
layer. Defined areas or windows of the normalised grey level character pixel map are used as the input for 
the neural network. In one embodiment, shown in Figure 7, two windows are used which are defined by the 
upper left and lower right co-ordinates of two rectangles of equivalent size. Together, the two rectangles 
bound the entire character pixel map. The windows are fed to the input layer of the neural network and 
each pixel within the boundaries of each window constitutes an input node of the input layer. 

Each node of the network computes the following function: 

\[
y_i = f\left( \sum_{j=1}^{N} w_{i,j} \, x_j - \theta_i \right) \tag{ii}
\]

where 

y_i = output activation value of node i 
x_j = jth signal input to node i 
w_{i,j} = connection weight from node j to node i 
\theta_i = bias 

In the above equation, f(x) is a monotonically increasing, differentiable function, the output of which is 
bounded between 0 and 1. 

The input nodes of the input layer are connected to specified ones of the hidden intermediate 
nodes of the first hidden intermediate layer. Thus, input nodes of the first window are connected to a first 
set of specified intermediate nodes and input nodes of the second window are connected to a second set of 
specified nodes. In this manner the input nodes associated with one window are fully connected to specified 
hidden intermediate nodes in the first layer but not to hidden intermediate nodes in the first layer associated 
with other windows. The two sets of connections in this network architecture are shown in Figure 7(i), together with a 
conventional network as shown in Figure 7(ii). For subsequent layers after the input layer and the first 
intermediate layer, the nodes of consecutive layers are fully connected. 

In a particular embodiment of the invention only the following characters need to be recognized: A to Z 
and 0 to 9, i.e. a total of 36 characters, 26 letters and 10 numerals. Therefore, the output layer would 
consist of 36 nodes each representing a particular character. 
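The window-based connectivity can be illustrated with a small forward pass. This is a sketch under assumptions, not the implementation of the embodiment: the names `layer` and `windowed_forward`, the single intermediate layer, and the use of a sigmoid for f are chosen here for illustration, and each node is taken to compute f of its weighted input sum minus its bias.

```python
import math


def f(x, T=1.0):
    """Sigmoid, bounded between 0 and 1 and monotonically increasing."""
    return 1.0 / (1.0 + math.exp(-x / T))


def layer(inputs, weights, biases):
    """Fully connected layer: y_i = f(sum_j w[i][j] * x[j] - bias[i])."""
    return [f(sum(w * x for w, x in zip(row, inputs)) - b)
            for row, b in zip(weights, biases)]


def windowed_forward(windows, window_params, out_weights, out_biases):
    """Forward pass in which each window feeds only its own hidden nodes.

    windows: one list of pixel values per window.
    window_params: one (weights, biases) pair per window.
    """
    hidden = []
    for pixels, (w, b) in zip(windows, window_params):
        hidden.extend(layer(pixels, w, b))
    # After the first intermediate layer, consecutive layers are fully connected.
    return layer(hidden, out_weights, out_biases)
```

As a rough count, if two non-overlapping windows each hold n pixels and each feeds h hidden nodes, the first stage needs 2nh weights, whereas connecting all 2n inputs to all 2h hidden nodes would need 4nh.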

Using a window-based neural network of this type allows windows to be strategically located in the 
character pixel map to discriminate between confusing characters and improve the accuracy with which the 
characters are classified. Also, compared to the conventional fully connected network shown in Figure 7(ii), 
fewer synaptic connections are required between the input and first intermediate layers, thus reducing 
processing time and memory requirements. Experimental results have shown that the time required to train 
such a system can be substantially reduced yet the recognition performance is improved slightly. 

The network is trained using the popular sigmoidal function: 

\[
f(x) = \frac{1}{1 + e^{-x/T}}
\]

and the network is operated using an approximation to the training function: 

\[
f(x) = \frac{T + x + |x|}{2\,(T + |x|)}
\]

In both cases, T is a parameter used to control the non-linearity. Both functions behave similarly in that 
they are both bounded between 0 and 1, are monotonically increasing and differentiable. Moreover, when T 
approaches 0, both become step functions. When T approaches infinity, both approximate to a horizontal line 
passing through f(x) = 0.5. Because the second function requires more iterations to attain a similar level of 
recognition performance compared to the first function, the first function is used for training purposes. 
However, after training, the second function replaces the first function without any degradation of recognition 
performance. 
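The two activation functions can be compared numerically. The sketch below assumes they read f(x) = 1 / (1 + e^(-x/T)) for training and f(x) = (T + x + |x|) / (2 (T + |x|)) for operation, which is a reconstruction consistent with the limiting behaviour described above; the function names are illustrative.

```python
import math


def sigmoid(x, T):
    """Training function: 1 / (1 + exp(-x / T))."""
    return 1.0 / (1.0 + math.exp(-x / T))


def fast_approx(x, T):
    """Operating function: (T + x + |x|) / (2 (T + |x|)); needs no exponential."""
    return (T + x + abs(x)) / (2.0 * (T + abs(x)))


# Both pass through 0.5 at x = 0 and saturate towards 0 and 1.
samples = [(x, sigmoid(x, 1.0), fast_approx(x, 1.0)) for x in (-3.0, 0.0, 3.0)]
```

The approximation replaces the exponential with an absolute value, an addition and a division, which is a plausible reason for preferring it once training is complete.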

In a preferred embodiment, the neural network does not attempt to distinguish between "O" and "0" or 
"I" and "1" because some container companies use identical character fonts for both characters. However, 
the matching procedure which is to be discussed below can resolve any ambiguity that arises due to this 
confusion. 

As previously discussed, the system is intended to verify the ID code of a particular container and this 
is finally achieved by a last processing step which compares the character string output from the neural 
network with the target ID code character string sent from the mainframe computer. However, because 
other processing steps are involved which may introduce errors, the two character strings may not be 
simply compared. For example, in the segmentation step, ID code characters may be discarded as noise, 
resulting in the omission of a character from the recognised character string, or noise such as smears or 
marks on the container surface may be treated as ID code characters, resulting in characters being inserted 
into the recognised character string. The segmentation step may determine a character boundary 
incorrectly and thereby split a single character into two. Also, the neural network may recognise a character 
incorrectly, thus substituting a particular character with an erroneous character. It can clearly be seen that 
the recognised character string may be potentially erroneous and so a matching step is included to 
establish the best match between the recognised character string and the target ID code string. 

Referring to Figure 8, the top row of characters represents the target ID string and the second row 
represents the character string achieved after the segmentation step. The neural network outputs a range of 
scores, each of which represents the degree of similarity between the input pixel map and the 36 character 
output classes of the neural network (i.e. the degree of recognition for each character class). Rows 3 and 4 
of Figure 8 show the best and second best scores respectively. For each entry in these rows, the character 
denotes the recognized code while the number in parentheses indicates the corresponding score. As shown 
in the example, the neural network commits two substitution errors since the best score for the letter "U" in 
the target string is given as "0" (0.7) and the target character of "2" is given by the neural network as "7" 
(0.6). In this example, the segmentation step has also introduced two errors: it has added a character, 
presumably in response to some noise, and has omitted the character "0" from the target ID code 
character string. The procedure described below has been designed to resolve these ambiguities in both 
the recognition and segmentation steps by converting the problem of verifying the target ID code character 
string into a two dimensional path searching problem, as shown in Figure 9 which indicates the matching 
path found for the example of Figure 8, the goal of which is to find the optimal matching path. 

In the preferred embodiment, the optimal path is found by a technique known as Dynamic Time 
Warping as discussed in Itakura F (1975) "Minimum prediction residual principle applied to speech 
recognition", IEEE Transactions on Acoustics, Speech and Signal Processing 23:67-72 and in Sakoe H, 
Chiba S (1978) "Dynamic programming algorithm optimisation for spoken word recognition", IEEE 
Transactions on Acoustics, Speech and Signal Processing 26:43-49. In this embodiment local continuity constraints 
as shown in Figure 10 are defined to restrict the search area. The constraints specify that to reach point (i, 
j), five local paths are possible (paths 1-5). Under normal circumstances, i.e. no insertion or deletion of 
characters at this point, path 3 should be taken. If path 2 is chosen, the algorithm suggests that the (i - 1)th 
character in the recognition string is an insertion error. Likewise, if path 4 is selected, it indicates that there 
is an omission between the (i - 1)th and ith positions. Path 1 and path 5 are included to take care of 
boundary conditions. Path 1 caters for extraneous characters positioned before or after the target ID string 
whereas path 5 is needed for cases where the first or last few ID characters are missing. Thus paths 1 and 
5 are invoked only near the beginning or the end of the search area. 

Once the local continuity constraints are specified, the global search area is determined. To find the 
best path, a cumulative matching score Di,j is maximised. This score can be recursively computed as 
follows: 

where S is the neural network response for the target character j, p is the penalty for an insertion error and 
q for an omission error. The penalties are set to 0.2 in this embodiment. 

The basic algorithm to find the optimal matching path is as follows: 

Let I be the number of characters in the recognized string and J the number of characters in the target 
ID string. 

Step 1: 
    for j = 1 to J 
        initialize D0,j to a large negative number 
    D0,0 = 0.0 

Step 2: 
    for i = 1 to I 
        for j = 1 to J + 1 
            compute Di,j according to equation (v) 
            register the local path taken in an array pathi,j 

Step 3: 
    starting from j = J + 1 
    while (j >= 0) 
        trace back the pathi,j array and register the matching character in the recognition string 
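The three steps above follow the usual dynamic programming pattern: fill a table of cumulative scores, record the local move taken at each cell, then trace back. The sketch below is a simplified stand-in and not the five-path recursion of the embodiment, whose exact form is not reproduced here. It assumes only three local moves (match, insertion, omission), penalises leading insertions and omissions in the same way as interior ones, and uses the illustrative names `align` and `scores`.

```python
def align(scores, p=0.2, q=0.2):
    """Align recognised characters (rows) with target characters (columns).

    scores[i][j] is the recogniser's score for recognised character i read
    as target character j. p and q are the insertion and omission penalties.
    Returns (best cumulative score, list of matched (i, j) index pairs).
    """
    n_rec, n_tgt = len(scores), len(scores[0])
    D = [[0.0] * (n_tgt + 1) for _ in range(n_rec + 1)]
    move = [[None] * (n_tgt + 1) for _ in range(n_rec + 1)]
    for i in range(1, n_rec + 1):
        D[i][0], move[i][0] = D[i - 1][0] - p, "insert"
    for j in range(1, n_tgt + 1):
        D[0][j], move[0][j] = D[0][j - 1] - q, "omit"
    for i in range(1, n_rec + 1):
        for j in range(1, n_tgt + 1):
            options = [
                (D[i - 1][j - 1] + scores[i - 1][j - 1], "match"),
                (D[i - 1][j] - p, "insert"),   # recognised character i is extra
                (D[i][j - 1] - q, "omit"),     # target character j was missed
            ]
            D[i][j], move[i][j] = max(options, key=lambda o: o[0])
    # Trace back from the final cell, collecting the matched pairs.
    pairs, i, j = [], n_rec, n_tgt
    while i > 0 or j > 0:
        m = move[i][j]
        if m == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif m == "insert":
            i -= 1
        else:
            j -= 1
    return D[n_rec][n_tgt], pairs[::-1]
```

With four recognised characters, of which the second is noise, and three target characters, the traceback skips the noise character and pairs the remaining three in order.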

At the end of Step 3, the optimal matching pairs between the target and the recognition string are 
determined. The result could be directly reported to the operator. However, to make the system easier for 
the operator to use, it is better to aggregate all the types of errors into a single performance index. This is 
achieved by defining a confidence measure as discussed below. 

The confidence measure must be able to reflect the actual accuracy of the entire system without 
penalising for insignificant mistakes. For example, insertion errors are not serious mistakes though they 
should be avoided as far as possible. As for character recognition, although the highest score produced by 
the recognizer may not correspond to the correct character, characters having the second or third highest 
scores may correspond. Further, the difference between this score and the highest score may be small. To 
take care of such situations, a penalty is introduced which depends on the extent to which the character has 
been mis-recognized. 

Thus, having considered all these factors, the performance index or confidence measure is computed 
as follows: 

\[
CM = \frac{1}{J} \sum_{i=1}^{N} \frac{\text{score}(i)}{\text{best score}(i)}
\]

where N is the number of matching character pairs found by the dynamic time warping procedure and score(i) 
is the similarity score of the ith character. If there is any omission error, then N will be less than J. If 
character i is correctly recognised, then score(i) is the best score and equals best score(i) and the 
contribution of that character to the sum is 1. It can be seen that if all the characters are segmented and 
recognized correctly, then the sum of the N elements is J and hence the confidence measure, CM = J/J = 
1. For insertion errors which are detected by the dynamic time warping process, there will be no character 
in the ID string associated with them. Hence these extra characters will not enter into the computation. 

Thus, the final result of the entire process is the CM score. Its value lies between 0 and 1. The operator
can therefore set a specific confidence measure threshold so that the container ID code verification is
deemed successful if the CM score is higher than the threshold. Otherwise, the operator is notified and can
investigate the matter.
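
The computation and the threshold test described above can be sketched in a few lines of Python. This is only an illustration, not the code of the preferred embodiment: the function names, the representation of each matched pair as a (score, best score) tuple, the reading of J as the number of characters in the target ID code, and the threshold value 0.9 are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def confidence_measure(matched: Sequence[Tuple[float, float]],
                       target_length: int) -> float:
    """Sum score(i) / best score(i) over the N matched pairs, divided by J.

    matched       -- one (score, best_score) tuple per character pair that
                     the alignment step matched (N entries; hypothetical layout)
    target_length -- J, assumed here to be the length of the target ID code
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    total = sum(score / best for score, best in matched)
    return total / target_length


def is_verified(cm: float, threshold: float) -> bool:
    """Verification succeeds only if CM is strictly higher than the threshold."""
    return cm > threshold


# All 11 characters of a hypothetical 11-character code matched correctly.
all_correct = [(0.9, 0.9)] * 11
print(confidence_measure(all_correct, 11))  # 1.0

# One character omitted (N = 10) and one misrecognised character whose
# score is half of the best score: CM = 9.5 / 11.
degraded = [(0.9, 0.9)] * 9 + [(0.25, 0.5)]
cm = confidence_measure(degraded, 11)
print(is_verified(cm, threshold=0.9))  # False
```

Inserted characters never appear in `matched`, so they do not affect the result, while each omitted character lowers CM by 1/J because it contributes nothing to the sum.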

In the preferred embodiment, the processes described above have been implemented on a five
transputer network which is hosted on a PC-AT microcomputer. The code was written in Parallel C Ver.
2.1.1 using the 3L Compiler. The character extraction and recognition processes were parallelised while the
normalization and matching procedures were handled by the master transputer. In order to implement the
character segmentation process in parallel, the entire image is divided into five equal vertical strips. Each
processor is allocated one image strip. Four of the vertical strips overlap in the x direction by an amount
equal to the maximum allowable width of a character minus 1. This is to ensure that every character is
extracted in full by a transputer. The master transputer distributes the image strips to the root and the three
workers. When all the processors have finished and sent back the potential bounding boxes, the master
proceeds with the grouping phase of the character extraction. The recognition process takes place in
parallel by distributing the character maps one by one to the worker processors. The master transputer
controls and coordinates the sending of pixel maps and receiving of recognition results.
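
The strip partitioning can be illustrated with a short Python sketch. It assumes that each strip except the last is extended to the right by the overlap, which is one way of making four of the five strips overlap their neighbour; the function name, the half-open column ranges, the image width of 500 pixels and the maximum character width of 21 pixels are hypothetical values chosen for the example.

```python
from typing import List, Tuple


def vertical_strips(image_width: int, num_strips: int,
                    max_char_width: int) -> List[Tuple[int, int]]:
    """Return half-open column ranges [start, end) for each vertical strip.

    Every strip except the last is extended to the right by
    (max_char_width - 1) columns, so a character that starts inside a strip
    and is no wider than max_char_width lies entirely within that strip.
    """
    if num_strips <= 0 or max_char_width < 1 or image_width < num_strips:
        raise ValueError("invalid strip parameters")
    overlap = max_char_width - 1
    base = image_width // num_strips  # nominal strip width
    strips = []
    for k in range(num_strips):
        start = k * base
        if k == num_strips - 1:
            end = image_width  # last strip takes any remaining columns
        else:
            end = min(image_width, (k + 1) * base + overlap)
        strips.append((start, end))
    return strips


print(vertical_strips(500, 5, 21))
# [(0, 120), (100, 220), (200, 320), (300, 420), (400, 500)]
```

With these values each of the first four strips shares 20 columns with its right-hand neighbour. A character lying in a shared region is found by both processors, so the duplicate bounding boxes would have to be merged before the grouping phase; that step is not shown here.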

While determining the performance of the system, the character extraction and recognition processes
were separately evaluated in addition to testing the accuracy of the overall system. This was done in order
to test the accuracy of the individual processes, since the errors produced during character extraction are
propagated to the recognition phase. The segmentation and character extraction process was tested on 191
character images under varying illumination conditions. The results of computer extraction, i.e. the bounding
boxes of characters, were compared to the coordinates of characters extracted manually. Table 1 shows
that the number of characters correctly segmented out comprises 91.365% of the total characters present in
the images.

A larger database of 441 character images was used to evaluate the recognition process. The statistical
distribution of characters was found to be uneven. For example, characters 'Q' and 'V' do not occur at all
while 'U' appears very frequently. On the other hand, since the sample size is sufficiently large, such
statistics reflect the true probability of occurrence of each individual character during actual operation. In the
recognition experiments, the database is partitioned into two sets: DS1 for training and DS2 for testing. DS1
consists of 221 images and 2231 alphanumeric characters, while DS2 has 2167 characters in the remaining
220 images. There are roughly twice as many numeral characters as alphabetic characters in both sets.

The estimated verification time for each container image is approximately 13.5 seconds. This includes
the time elapsed from when the container information is received from the Gate PC until the recognition
results and confidence measure are sent back to the Gate PC. Figures 1(a) and (b) show examples of
container codes extracted by the above described method. The extracted characters and their circumscribing
rectangles are shown after the thresholding process.

Modifications are envisaged to improve the robustness of the character extraction process and increase
its accuracy when the container image is noisy. Various approaches are being considered to recover
characters with poor contrast, merged characters and partially obliterated characters.

The features disclosed in the foregoing description, in the following claims and/or in the accompanying
drawings may, both separately and in any combination thereof, be material for realising the invention in
diverse forms thereof.


Claims 

1. A method of verifying a code (16) displayed on a container (8) surface against a target code
characterised by the steps of: capturing an image of the container (8) surface carrying the displayed
code; digitising the captured image to form a pixel array in which each pixel in the array has a
respective quantisation level; scanning the pixel array to detect potential characters; selecting and
grouping together potential characters of the displayed code; determining from the pixel values of the
potential characters a set of recognised characters constituting a recognised code; comparing the
target code with the recognised code; and either verifying or rejecting the recognised code in
dependence upon the results of the comparison between the target code and the recognised code.

2. A method according to Claim 1, wherein a neural network determines the set of recognised characters.

3. A method according to Claim 2, wherein the quantisation levels of the pixels in the array represent the
input values for input nodes of an input layer of the neural network.

4. A method according to Claim 3, wherein the input value of each input node corresponds to the
quantisation level of a respective pixel.

5. A method according to any one of Claims 2 to 4, wherein the pixel array is divided into a plurality of
windows.

6. A method according to Claim 5, wherein the neural network comprises: a plurality of input nodes
defining an input layer and at least one output node defining an output layer, one or more intermediate
layers being disposed between the input layer and the output layer, the input layer being divided into a
plurality of discrete areas each corresponding to a respective window; the pixel value of each pixel in a
window representing the input for a corresponding input node in an area of the input layer corresponding
to a respective window; each node within the network computing an output in accordance with a
predetermined function; the outputs of the nodes in each area of the input layer being connected to
specified nodes in a first intermediate layer which are not connected to the outputs of the nodes in
another area of the input layer; the output of the nodes in the first and subsequent intermediate layers
being connected to the inputs of the nodes in the immediately following layer; and the output nodes of
the last intermediate layer are connected to the inputs of the output node or nodes of the output layer.

7. A method according to any one of Claims 2 to 6, wherein the output of the neural network consists of a
set of scores indicating the degree of recognition between the character defined by the pixel array and
the classes of character recognisable by the neural network.

8. A method according to any of Claims 2 to 7, wherein the detected potential character is surrounded
with a bounding box.

9. A method according to Claim 8, wherein each bounding box is divided into a plurality of windows and
the quantisation value of the pixels contained in each window comprises the input values for a
corresponding discrete area defined in the input layer of the neural network.


10. A method according to any one of Claims 2 to 9 in which each node in the neural network computes the
function:

[equation not recoverable from the source; only the summation limit N survives]

where

y_i = output activation value of node i
x_j = j-th signal input to node i
w_ij = connection weight from node j to node i
θ_i = bias

11. A method according to any one of Claims 2 to 10 in which the neural network is operated using the
function:

$$f(x) = \frac{T + x + |x|}{2\,(T + |x|)}$$

12. A method according to any one of Claims 2 to 11 in which the neural network is trained using the
function:

$$f(x) = \frac{1}{1 + e^{\ldots}}$$

13. A method according to any one of Claims 2 to 12, wherein the neural network is included in a
transputer network (5).

14. A method according to any preceding claim, wherein the potential characters are detected by
identifying horizontal segments (h1...n) comprising horizontally adjacent pixels with substantially equivalent
quantisation levels and vertical segments (v1...n) comprising vertically adjacent pixels of substantially
equivalent quantisation levels; and horizontal and vertical segments (h1...n, v1...n) which are
connected together defining a potential character.

15. A method according to any preceding claim, wherein the potential characters which are spatially
located horizontally adjacent one another on the container surface are grouped together.

16. A method according to Claim 15, wherein only the horizontally grouped potential characters are 
selected. 

17. A method according to any preceding claim, wherein any single isolated potential characters are not
selected.

18. A method according to any preceding claim, wherein the polarity of the foreground and background of
each potential character is determined and converted to a standard polarity.

19. A method according to any preceding claim, wherein the potential characters are scaled to a standard 
size. 

20. A method according to any preceding claim, wherein the quantisation level of each pixel is normalised
to a value between 0 and 1.

21. A method according to any preceding claim, wherein the recognised code and the target code are
compared by obtaining a confidence measure representing the degree of match between the two codes
and comparing the confidence measure with a predetermined threshold.

22. A method according to any preceding claim, comprising positioning the container (8) carrying the
displayed code (16) adjacent an image capturing means (1, 2, 3) which serves to capture the image
of the container surface.

23. A method according to any preceding claim, wherein the container is artificially illuminated to increase 
the image contrast. 

24. A method according to any preceding claim, wherein the image is captured by at least one closed
circuit TV camera (1, 2, 3) connected to a frame grabber (11).

25. A method according to Claim 24, wherein the contrast parameter of the frame grabber (11) is
determined by Newton's iterative method.

26. A method according to any preceding claim, wherein the target code is obtained from a remote 
mainframe computer. 

27. A method according to any preceding claim, wherein the target code and the recognised code are
compared by means of a dynamic programming procedure.

28. A neural network for analysing an array comprising a plurality of elements, the array being divided into
a plurality of windows, the neural network comprising a plurality of input nodes defining an input layer
and at least one output node defining an output layer, one or more intermediate layers being disposed
between the input layer and the output layer, in which neural network the input layer is divided into a
plurality of discrete areas each corresponding to a respective window; the value of each element in a
window represents the input for a corresponding input node in an area of the input layer corresponding
to the respective window; each node within the network computes an output in accordance with a
predetermined function; the outputs of the nodes in each area of the input layer are connected to
specified nodes in a first intermediate layer which are not connected to the outputs of the nodes in
another area of the input layer; the outputs of the nodes in the first and subsequent intermediate layers
being connected to the inputs of the nodes in the immediately following layer; and the output nodes of
the last intermediate layer are connected to the inputs of the output node or nodes of the output layer.

29. A container code verification apparatus comprising: image capture means (1, 2, 3) to capture an image
of a container (8) surface carrying a displayed code (16); data processing means (5) to form a pixel
array from the captured image; scanning means to scan the pixel array; detection means to detect
potential characters; selection and grouping means to select and group together potential characters of
the displayed code; decision means to determine from the pixel values of the potential characters a
set of recognised characters constituting a recognised code; comparison means to compare the target
code with the recognised code; and verification means to verify or reject the recognised code in
dependence upon the results of the comparison between the target code and the recognised code.

30. A container code verification apparatus according to Claim 29 in which the decision means is a neural 
network. 

31. A container code verification apparatus according to Claim 30 in which the pixel array is divided into a
plurality of windows, the neural network comprising a plurality of input nodes defining an input layer
and at least one output node defining an output layer, one or more intermediate layers being disposed
between the input layer and the output layer, in which neural network the input layer is divided into a
plurality of discrete areas each corresponding to a respective window; the pixel value of each pixel in a
window represents the input for a corresponding input node in an area of the input layer corresponding
to the respective window; each node within the network computes an output in accordance with a
predetermined function; the outputs of the nodes in each area of the input layer are connected to
specified nodes in a first intermediate layer which are not connected to the outputs of the nodes in
another area of the input layer; the outputs of the nodes in the first and subsequent intermediate layers
being connected to the inputs of the nodes in the immediately following layer; and the output nodes of
the last intermediate layer are connected to the inputs of the output node or nodes of the output layer.